In November 2016, the world witnessed another important election in one of its major economic powers: the United States. The results surprised many.
We obtained a dataset from Kaggle with 34 columns and 397,629 rows. Each row represents a tweet posted on election day, and the rows are ordered from the earliest timestamp to the latest. A brief description of the features:
| Variable | Description |
|---|---|
| text | text of the tweet |
| created_at | date and time of the tweet (format yyyy-mm-dd hh:mm:ss) |
| geo | a JSON object containing coordinates [latitude, longitude] and a “type” |
| lang | Twitter’s guess as to the language of the tweet |
| place | a Place object from the Twitter API |
| coordinates | a JSON object containing coordinates [longitude, latitude] and a “type”; note that the coordinates are reversed relative to the geo field |
| user.favourites_count | number of tweets the user has favorited |
| user.statuses_count | number of statuses the user has posted |
| user.description | the text of the user’s profile description |
| user.location | text of the user’s profile location |
| user.id | unique id for the user |
| user.created_at | when the user created their account |
| user.verified | bool; is user verified? |
| user.following | bool; am I (Ed King - the data creator) following this user? |
| user.url | the URL that the user listed in their profile (not necessarily a link to their Twitter profile) |
| user.listed_count | number of lists this user is on (?) |
| user.followers_count | number of accounts that follow this user |
| user.default_profile_image | bool; does the user use the default profile pic? |
| user.utc_offset | positive or negative distance from UTC, in seconds |
| user.friends_count | number of accounts this user follows |
| user.default_profile | bool; does the user use the default profile? |
| user.name | user’s profile name |
| user.lang | user’s default language |
| user.screen_name | user’s account name |
| user.geo_enabled | bool; does user have geo enabled? |
| user.profile_background_color | user’s profile background color, as hex in format “RRGGBB” (no ‘#’) |
| user.profile_image_url | a link to the user’s profile pic |
| user.time_zone | full name of the user’s time zone |
| id | unique tweet ID |
| favorite_count | number of times the tweet has been favorited |
| retweeted | bool; is this a retweet? |
| source | if a link, where is it from (e.g., “Instagram”) |
| favorited | have I (Ed King - data creator) favorited this tweet? |
| retweet_count | number of times this tweet has been retweeted |
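
A few of these columns need light conversion before use. The sketch below is only an illustration on an invented two-row sample, not the Kaggle file itself: the `tweets` data frame, its values, the `utc_offset_hours` column, and the example point are all hypothetical, and treating `created_at` as UTC is an assumption the source does not confirm.

```r
# Hypothetical two-row sample; the values are invented for illustration only
tweets <- data.frame(
  created_at = c("2016-11-08 04:04:30", "2016-11-08 23:59:59"),
  user.utc_offset = c(-18000, 3600),
  stringsAsFactors = FALSE
)

# Parse the yyyy-mm-dd hh:mm:ss strings (time zone assumed to be UTC)
tweets$created_at <- as.POSIXct(tweets$created_at,
                                format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# user.utc_offset is given in seconds; divide by 3600 to get hours
tweets$utc_offset_hours <- tweets$user.utc_offset / 3600

# geo stores [latitude, longitude]; coordinates stores [longitude, latitude]
geo_point <- c(40.7128, -74.0060)     # hypothetical [lat, lon] pair
coordinates_point <- rev(geo_point)   # the same point as [lon, lat]
```

Keeping the reversal explicit matters when mixing the two fields, since a swapped pair still looks like a valid coordinate.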
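library(plotly)
# Heatmap of the built-in volcano matrix (87 rows by 61 columns): rows map to y, columns to x
plot_ly(x = 1:61, y = 1:87, z = volcano, type = "heatmap")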
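set.seed(123)  # make the random noise reproducible
x <- 1:100
y1 <- 2 * x + rnorm(100)   # increasing trend plus noise
y2 <- -2 * x + rnorm(100)  # decreasing trend plus noise
# Axis settings shared by the x and y axes
axis_template <- list(
  showgrid = FALSE,
  zeroline = FALSE,
  nticks = 20,
  showline = TRUE,
  title = "AXIS",
  mirror = "all")
# No mode is given here, so plotly falls back to markers and prints the message that follows
plot_ly(
  x = x,
  y = y1,
  type = "scatter") %>%
  layout(
    xaxis = axis_template,
    yaxis = axis_template)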
## No scatter mode specifed:
## Setting the mode to markers
## Read more about this attribute -> https://plot.ly/r/reference/#scatter-mode
# Setting the mode explicitly avoids the message
plot_ly(x = c(1, 2, 3),
        y = c(5, 6, 4),
        type = "scatter",
        mode = "markers")
# One color per marker
plot_ly(x = c(1, 2, 4),
        y = c(2, 3, 2),
        type = "scatter",
        mode = "markers",
        marker = list(color = c("green", "blue", "red")))
# Two traces on one plot; the legend is positioned at x = 0.5, y = 1
plot_ly(x = x,
        y = y1,
        type = "scatter",
        mode = "markers") %>%
  add_trace(x = x,
            y = y2) %>%
  layout(
    legend = list(x = 0.5,
                  y = 1,
                  bgcolor = "#F3F3F3"))